929 research outputs found
Finding Rhythm in Speech: A Response to Cummins
This paper attempts to address three critical questions left unanswered by Cummins’ review: are rhythm and entrainment physical, perceptual or social phenomena, what are the underlying mechanisms, and what is their role in behaviour such as speech and music? These issues are addressed from the perspective of an engineer/computer-scientist/roboticist for whom modelling such behaviours within a computational framework not only provides an empirical methodology for validating theoretical claims, but also facilitates the construction of artificial devices that are capable of exhibiting/exploiting those behaviours in the context of human-machine interaction. The paper draws on insights from a range of different perspectives, and attempts to weave them together within a coherent theoretical framework. It is concluded that (i) rhythm and entrainment are phenomena that emerge naturally from the structural coupling within and between even simple systems, (ii) living systems have evolved very effective mechanisms for managing such behaviours for intrinsic and extrinsic gains, and (iii) the fields of energetics and information theory provide the appropriate tools for analysing and characterising such behaviour within a general theoretical framework. It is hoped that these insights will inspire future cross-disciplinary research in these areas, and lead to a deeper understanding of these fundamental behaviours
Adapting the NICT-JLE Corpus for Disfluency Detection Models
The detection of disfluencies such as hesitations, repetitions and false
starts commonly found in speech is a widely studied area of research. With a
standardised process for evaluation using the Switchboard Corpus, model
performance can be easily compared across approaches. This is not the case for
disfluency detection research on learner speech, however, where such datasets
have restricted access policies, making comparison and subsequent development
of improved models more challenging. To address this issue, this paper
describes the adaptation of the NICT-JLE corpus, containing approximately 300
hours of English learners' oral proficiency tests, to a format that is suitable
for disfluency detection model training and evaluation. Points of difference
between the NICT-JLE and Switchboard corpora are explored, followed by a
detailed overview of adaptations to the tag set and meta-features of the
NICT-JLE corpus. The result of this work provides a standardised train, held-out
and test set for use in future research on disfluency detection for learner
speech
Time-evolving a matrix product state with long-ranged interactions
We introduce a numerical algorithm to simulate the time evolution of a matrix
product state under a long-ranged Hamiltonian. In the effectively
one-dimensional representation of a system by matrix product states,
long-ranged interactions are necessary to simulate not just many physical
interactions but also higher-dimensional problems with short-ranged
interactions. Since our method overcomes the restriction to short-ranged
Hamiltonians of most existing methods, it proves particularly useful for
studying the dynamics of both power-law interacting one-dimensional systems,
such as Coulombic and dipolar systems, and quasi two-dimensional systems, such
as strips or cylinders. First, we benchmark the method by verifying a
long-standing theoretical prediction for the dynamical correlation functions of
the Haldane-Shastry model. Second, we simulate the time evolution of an
expanding cloud of particles in the two-dimensional Bose-Hubbard model, a
subject of several recent experiments. Comment: 5 pages + 3 pages appendices, 4 figures
A silent speech system based on permanent magnet articulography and direct synthesis
In this paper we present a silent speech interface (SSI) system aimed at restoring speech communication for individuals who have lost their voice due to laryngectomy or diseases affecting the vocal folds. In the proposed system, articulatory data captured from the lips and tongue using permanent magnet articulography (PMA) are converted into audible speech using a speaker-dependent transformation learned from simultaneous recordings of PMA and audio signals acquired before laryngectomy. The transformation is represented using a mixture of factor analysers, which is a generative model that allows us to efficiently model non-linear behaviour and perform dimensionality reduction at the same time. The learned transformation is then deployed during normal usage of the SSI to restore the acoustic speech signal associated with the captured PMA data. The proposed system is evaluated using objective quality measures and listening tests on two databases containing PMA and audio recordings for normal speakers. Results show that it is possible to reconstruct speech from articulator movements captured by an unobtrusive technique without an intermediate recognition step. The SSI is capable of producing speech of sufficient intelligibility and naturalness that the speaker is clearly identifiable, but problems remain in scaling up the process to function consistently for phonetically rich vocabularies
Integrating user-centred design in the development of a silent speech interface based on permanent magnetic articulography
A new wearable silent speech interface (SSI) based on Permanent Magnetic Articulography (PMA) was developed with the involvement of end users in the design process. Hence, desirable features such as appearance, portability, ease of use and light weight were integrated into the prototype. The aim of this paper is to address the challenges faced and the design considerations addressed during the development. Evaluations of both hardware and speech recognition performance are presented here. The new prototype shows comparable performance to its predecessor in terms of speech recognition accuracy (i.e. ~95% word accuracy and ~75% sequence accuracy), but significantly improved appearance, portability and hardware features in terms of miniaturization and cost
Usability, Acceptability, and Effectiveness of Web-Based Conversational Agents to Facilitate Problem Solving in Older Adults: Controlled Study.
BACKGROUND: The usability and effectiveness of conversational agents (chatbots) that deliver psychological therapies is under-researched. OBJECTIVE: This study aimed to compare the system usability, acceptability, and effectiveness in older adults of 2 Web-based conversational agents that differ in theoretical orientation and approach. METHODS: In a randomized study, 112 older adults were allocated to 1 of the following 2 fully automated interventions: Manage Your Life Online (MYLO; ie, a chatbot that mimics a therapist using a method of levels approach) and ELIZA (a chatbot that mimics a therapist using a humanistic counseling approach). The primary outcome was problem distress and resolution, with secondary outcome measures of system usability and clinical outcome. RESULTS: MYLO participants spent significantly longer interacting with the conversational agent. Post hoc tests indicated that MYLO participants had significantly lower problem distress at follow-up. There were no differences between MYLO and ELIZA in terms of problem resolution. MYLO was rated as significantly more helpful and likely to be used again. System usability of both the conversational agents was associated with helpfulness of the agents and the willingness of the participants to reuse. Adherence was high. A total of 12% (7/59) of the MYLO group did not carry out their conversation with the chatbot. CONCLUSIONS: Controlled studies of chatbots need to be conducted in clinical populations across different age groups. The potential integration of chatbots into psychological care in routine services is discussed
Quantum transport and two-parameter scaling at the surface of a weak topological insulator
Weak topological insulators have an even number of Dirac cones in their
surface spectrum and are thought to be unstable to disorder, which leads to an
insulating surface. Here we argue that the presence of disorder alone will not
localize the surface states; rather, the presence of a time-reversal symmetric
mass term is required for localization. Through numerical simulations, we show
that in the absence of the mass term the surface always flows to a stable
metallic phase and the conductivity obeys a one-parameter scaling relation,
just as in the case of a strong topological insulator surface. With the
inclusion of the mass, the transport properties of the surface of a weak
topological insulator follow a two-parameter scaling form. Comment: 4 pages + Appendices, v2 added conductance distributio
Acceptability and Effectiveness of NHS-Recommended e-Therapies for Depression, Anxiety, and Stress: Meta-Analysis.
BACKGROUND: There is a disconnect between the ability to swiftly develop e-therapies for the treatment of depression, anxiety, and stress, and the scrupulous evaluation of their clinical utility. This creates a risk that the e-therapies routinely provided within publicly funded psychological health care have evaded appropriate rigorous evaluation in their development. OBJECTIVE: This study aims to conduct a meta-analytic review of the gold standard evidence of the acceptability and clinical effectiveness of e-therapies recommended for use in the National Health Service (NHS) in the United Kingdom. METHODS: Systematic searches identified appropriate randomized controlled trials (RCTs). Depression, anxiety, and stress outcomes at the end of treatment and follow-up were synthesized using a random-effects meta-analysis. The grading of recommendations assessment, development, and evaluation approach was used to assess the quality of each meta-analytic comparison. Moderators of treatment effect were examined using subgroup and meta-regression analysis. Dropout rates for e-therapies (as a proxy for acceptability) were compared against controls. RESULTS: A total of 24 studies evaluating 7 of 48 NHS-recommended e-therapies were qualitatively and quantitatively synthesized. Depression, anxiety, and stress outcomes for e-therapies were superior to controls (depression: standardized mean difference [SMD] 0.38, 95% CI 0.24 to 0.52, N=7075; anxiety and stress: SMD 0.43, 95% CI 0.24 to 0.63, n=4863), and these small effects were maintained at follow-up. Average dropout rates for e-therapies (31%, SD 17.35) were significantly higher than those of controls (17%, SD 13.31). Limited moderators of the treatment effect were found. CONCLUSIONS: Many NHS-recommended e-therapies have not been through an RCT-style evaluation. The e-therapies that have been appropriately evaluated generate small but significant, durable, beneficial treatment effects. 
TRIAL REGISTRATION: International Prospective Register of Systematic Reviews (PROSPERO) registration CRD42019130184; https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=130184
- …